Abstract

Many theories of semantic interpretation use λ-term
manipulation to compositionally compute the meaning of a
sentence. These theories are usually implemented in a
language such as Prolog that can simulate λ-term operations
with first-order unification. However, for some interesting
cases, such as a Combinatory Categorial Grammar account of
coordination constructs, this can only be done by obscuring
the underlying linguistic theory with the "tricks" needed
for implementation. This paper shows how the use of abstract
syntax permitted by higher-order logic programming allows an
elegant implementation of the semantics of Combinatory
Categorial Grammar, including its handling of coordination
constructs.

1 Introduction 

Many theories of semantic interpretation use λ-term
manipulation to compositionally compute the meaning of a
sentence. These theories are usually implemented in a
language such as Prolog that can simulate λ-term operations
with first-order unification. However, there are cases in
which this can only be done by obscuring the underlying
linguistic theory with the "tricks" needed for
implementation. For example, Combinatory Categorial Grammar
(CCG) (Steedman, 1990) is a theory of syntax and semantic
interpretation that has the attractive characteristic of
handling many coordination constructs that other theories
cannot. While many aspects of CCG semantics can be
reasonably simulated in first-order unification, the
simulation breaks down on some of the most interesting cases
that CCG can theoretically handle. The problem in general,
and for CCG in particular, is that the implementation
language does not have sufficient expressive power to allow
a more direct encoding. The solution given in this paper is
to show how advances in logic programming allow the
implementation of semantic theories in a very direct and
natural way, using CCG as a case study.

We begin by briefly illustrating why first-order 
unification is inadequate for some coordination con- 
structs, and then review two proposed solutions. 
The sentence in (1a) usually has the logical form
(LF) in (1b).

(1a) John and Bill run.

(1b) (and (run John) (run Bill))

CCG is one of several theories in which (1b) gets
derived by raising John to be the LF λP.(P john),
where P is a predicate that takes an NP as an argument
to return a sentence. Likewise, Bill gets the
LF λP.(P bill), and coordination results in the
following LF for John and Bill:



(2) λP.(and (P john) (P bill))

When (2) is applied to the predicate, (1b) will result
after β-reduction. However, under first-order
unification, this needs to be simulated by having the
variable x in λx.run(x) unify both with Bill and
John, and this is not possible. See (Jowsey, 1990)
and (Moore, 1989) for a thorough discussion.
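The contrast can be made concrete with a small sketch in Python, which is not the language used in this paper. The names unify, walk, is_var, run, and john_and_bill are illustrative assumptions, and the toy unifier omits the occurs check. Logic variables are modelled as capitalised strings, as in Prolog. Binding X to john and then to bill fails, whereas treating (2) as a genuine function and applying it to the predicate yields (1b) directly.

```python
# Toy first-order unification without an occurs check (illustrative only).
# Variables are capitalised strings; constants are lowercase strings;
# compound terms are tuples such as ("run", "john").

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until an unbound variable or a non-variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    # Return an extended substitution, or None if a and b do not unify.
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# The single variable X of run(X) cannot stand for both john and bill.
after_john = unify("X", "john", {})
after_bill = unify("X", "bill", after_john)

# With genuine functions, (2) applied to the predicate reduces to (1b).
def john_and_bill(p):
    return ("and", p("john"), p("bill"))

def run(x):
    return ("run", x)

lf_1b = john_and_bill(run)
```

Here after_bill is None because X is already bound to john, while lf_1b is the nested tuple corresponding to (1b).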

(Moore, 1989) suggests that the way to overcome
this problem is to use explicit λ-terms and encode
β-reduction to perform the needed reduction. For
example, the logical form in (3) would be produced,
where X\run(X) is the representation of λx.run(x).

(3) and(apply(X\run(X), john),
        apply(X\run(X), bill))

This would then be reduced by the clauses for apply
to result in (1b). For this small example, writing
such an apply predicate is not difficult. However,
as the semantic terms become more complex, it is
no trivial matter to write a β-reduction that will
correctly handle variable capture. Also, if at some point
it was desired to determine whether the semantic forms of
two different sentences were the same, a predicate
would be needed to compare two lambda forms for
α-equivalence, which again is not a simple task.
Essentially, the logic variable X is meant to be
interpreted as a bound variable, which requires an
additional layer of programming.
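To give a sense of what this additional layer involves, the following Python sketch (not from the paper) implements capture-avoiding substitution and an α-equivalence test over a named-variable encoding. The term encoding, the function names, and the fresh-name scheme are all assumptions made for illustration; in particular the generated names are assumed not to clash with names already in the term.

```python
from itertools import count

# Terms: ("var", name) | ("lam", name, body) | ("app", fun, arg)
_fresh = count()

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, name, value):
    """Capture-avoiding substitution t[name := value]."""
    if t[0] == "var":
        return value if t[1] == name else t
    if t[0] == "app":
        return ("app", subst(t[1], name, value), subst(t[2], name, value))
    bound, body = t[1], t[2]
    if bound == name:              # name is shadowed; nothing to substitute
        return t
    if bound in free_vars(value):  # rename the binder to avoid capture
        new = f"{bound}_{next(_fresh)}"
        body = subst(body, bound, ("var", new))
        bound = new
    return ("lam", bound, subst(body, name, value))

def alpha_eq(a, b, pairs=()):
    """Equality up to renaming of bound variables."""
    if a[0] != b[0]:
        return False
    if a[0] == "var":
        for x, y in reversed(pairs):      # innermost binder first
            if x == a[1] or y == b[1]:
                return x == a[1] and y == b[1]
        return a[1] == b[1]               # both free
    if a[0] == "lam":
        return alpha_eq(a[2], b[2], pairs + ((a[1], b[1]),))
    return alpha_eq(a[1], b[1], pairs) and alpha_eq(a[2], b[2], pairs)

# Substituting y for x in (lam y. x y) must not capture the free y.
inner = ("lam", "y", ("app", ("var", "x"), ("var", "y")))
result = subst(inner, "x", ("var", "y"))
expected = ("lam", "z", ("app", ("var", "y"), ("var", "z")))
captured = ("lam", "y", ("app", ("var", "y"), ("var", "y")))
```

Even for this three-constructor language, renaming and binder bookkeeping take up most of the code, which is the overhead that built-in abstract syntax is meant to remove.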



(Park, 1992) proposes a solution within first-order 
unification that can handle not only sentence (1a),
but also more complex examples with determiners. 
The method used is to introduce spurious bindings 
that subsequently get removed. For example, the 
semantics of (4a) would be (4b), which would then 
get simplified to (4c). 

(4a) A farmer and every senator talk 



Function Application (>):
    X/Y : F    Y : y    =>    X : Fy

Function Application (<):
    Y : y    X\Y : F    =>    X : Fy

Function Composition (> B):
    X/Y : F    Y/Z : G    =>    X/Z : λx.F(Gx)

Function Composition (< B):
    Y\Z : G    X\Y : F    =>    X\Z : λx.F(Gx)

Type Raising (> T):
    np : x    =>    s/(s\np) : λF.Fx

Type Raising (< T):
    np : x    =>    s\(s/np) : λF.Fx

Figure 1: CCG rules



(4b) exists(X1, farmer(X1)
       &(exists(X2, (X2=X1)&talk(X2))))
     &forall(X3, senator(X3)
       => (exists(X2, (X2=X3)&talk(X2))))

(4c) exists(X1, farmer(X1)&talk(X1))
     &forall(X3, senator(X3)=>talk(X3))

While this pushes first-order unification beyond 
what it had been previously shown capable of, there 
are two disadvantages to this technique: (1) For ev- 
ery possible category that can be conjoined, a sepa- 
rate lexical entry for and is required, and (2) As the 
conjoinable categories become more complex, the 
and entries become correspondingly more complex 
and greatly obscure the theoretical background of 
the grammar formalism. 

The fundamental problem in both cases is that the concept of
free and bound occurrences of variables is not supported by
Prolog, but instead needs to be implemented by additional
programming. While theoretically possible, it becomes quite
problematic to actually implement. The solution given in
this paper is to use a higher-order logic programming
language, λProlog, that already implements these concepts,
called "abstract syntax" in (Miller, 1991) and "higher-order
abstract syntax" in (Pfenning and Elliot, 1988). This allows
a natural and elegant implementation of the grammatical
theory, with only one lexical entry for and. This paper is
meant to be viewed as furthering the exploration of the
utility of higher-order logic programming for computational
linguistics - see, for example, (Miller & Nadathur, 1986),
(Pareschi, 1989), and (Pereira, 1990).

2 CCG

CCG is a grammatical formalism in which there is a
one-to-one correspondence between the rules of composition¹
at the level of syntax and logical form. Each word is
(perhaps ambiguously) assigned a category and LF, and when
the syntactical operations assign a new category to a
constituent, the corresponding semantic operations produce a
new LF for that constituent as well. The CCG rules shown in
Figure 1 are implemented in the system described in this
paper.² Each of the three operations has both a forward and
a backward variant.

As an illustration of how the semantic rules can be
simulated in first-order unification, consider the
derivation of the constituent harry found, where harry has
the category np with LF harry' and found is a transitive
verb of category (s\np)/np with LF

(5) λobject.λsubject.(found' subject object)

In the CCG formalism, the derivation is as follows: harry
gets raised with the > T rule, and is then forward composed
by the > B rule with found, and the result is a category of
type s/np with LF λx.(found' harry' x). In section 3 it will
be seen how the use of abstract syntax allows this to be
expressed directly. In first-order unification, it is
simulated as shown in Figure 2.⁴

1 In the general sense, not specifically the CCG rule
for function composition.

2 The type-raising rules shown are actually a simplification
of what has been implemented. In order to handle
determiners, a system similar to NP-complement categories as
discussed in (Dowty, 1988) is used. Although a worthwhile
further demonstration of the use of abstract syntax, it has
been left out of this paper for space reasons.

3 The \ for a backward-looking category should not
be confused with the \ for λ-abstraction.

harry                       found
--------------------- >T
S:s/(S:s\NP:harry')         (S:found' np1 np2\NP:np1)/NP:np2
------------------------------------------------------------ >B
S:found' harry' np2/NP:np2

Figure 2: CCG derivation of harry found simulated
by first-order unification
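The pairing of syntactic and semantic operations in this derivation can be traced with a short Python sketch (not the paper's implementation). Categories are encoded as nested tuples and logical forms as Python functions. The helper names fwd, bwd, is_slash, forward_apply, forward_compose, and type_raise_forward are assumptions introduced here, and only the forward rules of Figure 1 are covered.

```python
# Categories: atomic ones are strings; X/Y is ("/", X, Y); X\Y is ("\\", X, Y).
# A constituent is a (category, logical_form) pair.

def fwd(x, y):
    return ("/", x, y)

def bwd(x, y):
    return ("\\", x, y)

def is_slash(cat, slash):
    return isinstance(cat, tuple) and cat[0] == slash

def forward_apply(left, right):
    # X/Y : F   Y : y   =>   X : F y
    (cat_l, f), (cat_r, y) = left, right
    if is_slash(cat_l, "/") and cat_l[2] == cat_r:
        return (cat_l[1], f(y))
    return None

def forward_compose(left, right):
    # X/Y : F   Y/Z : G   =>   X/Z : lambda x. F(G x)
    (cat_l, f), (cat_r, g) = left, right
    if is_slash(cat_l, "/") and is_slash(cat_r, "/") and cat_l[2] == cat_r[1]:
        return (fwd(cat_l[1], cat_r[2]), lambda x: f(g(x)))
    return None

def type_raise_forward(constituent):
    # np : x   =>   s/(s\np) : lambda F. F x
    cat, x = constituent
    if cat != "np":
        return None
    return (fwd("s", bwd("s", "np")), lambda f: f(x))

harry = ("np", "harry'")
# found : (s\np)/np with LF (5)
found = (fwd(bwd("s", "np"), "np"),
         lambda obj: lambda subj: ("found'", subj, obj))

harry_found = forward_compose(type_raise_forward(harry), found)
sentence = forward_apply(harry_found, ("np", "mary'"))
```

Under these assumptions harry_found has category s/np, and applying it to an object noun phrase fills the object slot of found' while harry' stays in subject position.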

The final CCG rule to be considered is the coor- 
dination rule that specifies that only like categories 
can coordinate: 

(6) X conj X => X 

This is actually a schema for a family of rules,
collectively called "generalized coordination", since the
semantic rule is different for each case.⁵ For example,
if X is a unary function, then the semantic rule is
(7a), and if the functions have two arguments, then
the rule is (7b).⁶

(7a) ΦFGH = λx.F(Gx)(Hx)

(7b) Φ²FGH = λx.λy.F(Gxy)(Hxy)

For example, when processing (1a), rule (7a) would
be used with:

• F = λx.λy.(and' x y)

• G = λP.(P john')

• H = λP.(P bill')

with the result

ΦFGH = λx.(and' (x john') (x bill'))

which is α-equivalent to (2).
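Rules (7a) and (7b) can be written down directly as curried Python functions. This is a sketch for checking the worked example, not code from the paper; the names phi, phi2, and_, run, like, and hate are assumptions.

```python
def phi(F, G, H):
    # (7a): Phi F G H = lambda x. F (G x) (H x)
    return lambda x: F(G(x))(H(x))

def phi2(F, G, H):
    # (7b): Phi^2 F G H = lambda x. lambda y. F (G x y) (H x y)
    return lambda x: lambda y: F(G(x)(y))(H(x)(y))

# F = lambda x. lambda y. (and' x y)
and_ = lambda x: lambda y: ("and'", x, y)

# Type-raised John and Bill: G = lambda P. (P john'), H = lambda P. (P bill')
G = lambda P: P("john'")
H = lambda P: P("bill'")
john_and_bill = phi(and_, G, H)

run = lambda x: ("run'", x)
lf_run = john_and_bill(run)

# Two-argument case: coordinating two transitive verbs.
like = lambda obj: lambda sub: ("like'", sub, obj)
hate = lambda obj: lambda sub: ("hate'", sub, obj)
like_and_hate = phi2(and_, like, hate)
lf_verbs = like_and_hate("o")("s")
```

Note that a separate definition is needed for each arity, which is exactly the "family of rules" that the schema in (6) stands for.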



4 Example adapted from (Steedman, 1990, p. 220).

5 It is not established if this schema should actually
produce an unbounded family of rules. See (Weir, 1988)
and (Weir and Joshi, 1988) for a discussion of the
implications for automata-theoretic power of generalized
coordination and composition, and (Gazdar, 1988) for
linguistic arguments that languages like Dutch may
require this power, and (Steedman, 1990) for some further
discussion of the issue. In this paper we use the
generalized rule to illustrate the elegance of the
representation, but it is an easy change to implement a
bounded coordination rule.

6 The Φ notation is used because of the combinatory
logic background of CCG. See (Steedman, 1990)
for details.
Figure 3: Declarations for λProlog representation of
CCG logical forms

3 λProlog and Abstract Syntax

λProlog is a logic programming language based on
higher-order hereditary Harrop formulae (Miller et
al., 1991). It differs from Prolog in that first-order
terms and unification are replaced with simply-typed
λ-terms and higher-order unification⁷, respectively.
It also permits universal quantification and implication
in the goals of clauses. The crucial aspect for
this paper is that together these features permit the
usage of abstract syntax to express the logical form
terms computed by CCG. The built-in λ-term manipulation
is used as a "meta-language" in which the
"object-language" of CCG logical forms is expressed,
and variables in the object-language are mapped to
variables in the meta-language.

The λProlog code fragment shown in Figure 3 declares
how the CCG logical forms are represented.
Each CCG LF is represented as an untyped λ-term,
namely type tm. abs represents object-level abstraction
λx.M by the meta-level expression (abs N),
where N is a meta-level function of type tm -> tm.
A meta-level λ-abstraction λy.P is written y\P.⁸

Thus, if walked' has type tm -> tm, then
y\(walked' y) is a λProlog (meta-level) function
with type tm -> tm, and (abs y\(walked' y)) is the
object-level representation, with type tm. The LF
for found shown in (5) would be represented as
(abs obj\(abs sub\(found' sub obj))). app encodes
application, and so in the derivation of harry
found, the type-raised harry has the λProlog value
(abs p\(app p harry')).⁹
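The encoding can be mimicked in Python, with host-language functions playing the role of meta-level abstractions. This is only an analogy under stated assumptions: Python has no higher-order unification, the classes Abs and App are hypothetical stand-ins for the constructors abs and app, and constants such as found' are modelled as tagged tuples.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Abs:
    # Object-level abstraction: wraps a meta-level function tm -> tm.
    body: Callable[[Any], Any]

@dataclass(frozen=True)
class App:
    # Object-level application of one term to another.
    fun: Any
    arg: Any

# (abs y\(walked' y))
walked = Abs(lambda y: ("walked'", y))

# (abs obj\(abs sub\(found' sub obj))), the LF in (5)
found = Abs(lambda obj: Abs(lambda sub: ("found'", sub, obj)))

# (abs p\(app p harry')), the type-raised harry
raised_harry = Abs(lambda p: App(p, "harry'"))

# Instantiating a binder is just a host-language function call.
inside_found = found.body("o").body("s")
inside_raised = raised_harry.body("q")
```

No variable names are stored in the terms, so renaming bound variables and avoiding capture are delegated to the host language, which mirrors the division of labour described above.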

7 Defined as the unification of simply typed λ-terms,
modulo βη-conversion.

8 This is the same syntax for λ-abstraction as in
(3). (Moore, 1989) in fact borrows the notation for
λ-abstraction from λProlog. The difference, of course, is
that here the abstraction is a meta-level, built-in
construct, while in (3) the interpretation is dependent on an
extra layer of programming. Bound variables in λProlog
can be either upper or lower case, since they are not logic
variables, and will be written in lower case in this paper.

9 It is possible to represent the logical forms at the
object-level without using abs and app, so that harry
could be simply p\(p harry'). The original implementation
of this system was in fact done in this manner.
Space prohibits a full explanation, but essentially the
fact that λProlog is a typed language leads to a good
deal of formal clutter if this method is used.

type apply   tm -> tm -> tm -> o.
type compose tm -> tm -> tm -> o.
type raise   tm -> tm -> o.

apply (abs R) S (R S).
compose (abs F) (abs G) (abs x\(F (G x))).
raise Tm (abs P\(app P Tm)).

Figure 4: λProlog implementation of CCG logical
form operations

The second part of Figure 3 declares how
quantifiers are represented, which are required since
the sentences to be processed may have determiners.
forall and exists are encoded similarly to abstraction,
in that they take a functional argument and
so object-level binding of variables by quantifiers is
handled by meta-level λ-abstraction. >> and && are
simple constructors for implication and conjunction,
to be used with forall and exists respectively, in
the typical manner (Pereira and Shieber, 1987). For
example, the sentence every man found a bone has as
a possible LF (8a), with the λProlog representation
(8b).¹⁰

(8a) ∃x.((bone' x) ∧ ∀y.((man' y) → (found' y x)))

(8b) (exists x\
       ((bone' x) &&
        (forall x1\
          ((man' x1) >> (found' x1 x)))))

10 The LF for the determiner has the form of a Montagovian
generalized quantifier, giving rise to one fully
scoped logical form for the sentence. It should be
stressed that this particular kind of LF is assumed here
purely for the sake of illustration, to make the point that
composition at the level of derivation and LF are
one-to-one. Section 4 contains an example for which such a
derivation fails to yield all available quantifier scopings.
We do not address here the further question of how the
remaining scoped readings are derived. Alternatives that
appear compatible with the present approach are quantifier
movement (Hobbs & Shieber, 1987), type-raising at
LF (Partee & Rooth, 1984), or the use of disambiguated
quantifiers in the derivation itself (Park, 1995).

11 There are other clauses, not shown here, that determine
the direction of the CCG rule. For either direction,
however, the semantics are the same and both directional
rules call these clauses for the semantic computation.

Figure 4 illustrates how directly the CCG operations
can be encoded.¹¹ o is the type of a meta-level
proposition, and so the intended usage of apply is
to take three arguments of type tm, where the first
should be an object-level λ-abstraction, and set the
third equal to the application of the first to the
second. Thus, for the query

?- apply (abs sub\(walked' sub)) harry' M.

R unifies with the tm -> tm function
sub\(walked' sub), S with harry' and M with (R S),
the meta-level application of R to S, which by the
built-in β-reduction is (walked' harry'). In other
words, object-level function application is handled
simply by the meta-level function application.

Function composition is similar. Consider
again the derivation of harry found by type-raising
and forward composition. harry would
get type-raised by the raise clause to produce
(abs p\(app p harry')), and then composed with
found, with the result shown in the following query:

?- compose (abs p\(app p harry'))
           (abs obj\
             (abs sub\
               (found' sub obj)))
           M.

M = (abs x\
      (app
        (abs sub\(found' sub x))
        harry')).

At this point a further β-reduction is needed. Note,
however, that this is not at all the same problem as
writing a β-reducer in Prolog. Instead it is a
simple matter of using the meta-level β-reduction
to eliminate β-redexes to produce the final result
(abs x\(found' harry' x)). We won't show the
complete declaration of the β-reducer, but the key
clause is simply:

red (app (abs M) N) (M N).

Thus, using the abstract syntax capabilities of
λProlog, we can have a direct implementation of the
underlying linguistic formalism, in stark contrast to
the first-order simulation shown in Figure 2.
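The three clauses of Figure 4 and the key red clause have close analogues in the Python encoding, sketched below. This is not the paper's code: the relations become functions that return their last argument, the traversal in red is a guess at what the unshown clauses do, and termination is only expected for terms that have a normal form. Abs, App, apply_, compose, raise_, and red are assumed names.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Abs:
    body: Callable[[Any], Any]   # meta-level function tm -> tm

@dataclass(frozen=True)
class App:
    fun: Any
    arg: Any

def apply_(f, s):
    # apply (abs R) S (R S).
    return f.body(s)

def compose(f, g):
    # compose (abs F) (abs G) (abs x\(F (G x))).
    return Abs(lambda x: f.body(g.body(x)))

def raise_(tm):
    # raise Tm (abs P\(app P Tm)).
    return Abs(lambda p: App(p, tm))

def red(t):
    # Eliminate object-level redexes: red (app (abs M) N) (M N).
    if isinstance(t, App):
        fun, arg = red(t.fun), red(t.arg)
        if isinstance(fun, Abs):
            return red(fun.body(arg))
        return App(fun, arg)
    if isinstance(t, Abs):
        # Push the reduction under the binder.
        return Abs(lambda x: red(t.body(x)))
    if isinstance(t, tuple):
        # Constant applied to arguments, e.g. ("found'", sub, obj).
        return (t[0],) + tuple(red(a) for a in t[1:])
    return t

found = Abs(lambda obj: Abs(lambda sub: ("found'", sub, obj)))
harry_walked = apply_(Abs(lambda sub: ("walked'", sub)), "harry'")
harry_found = compose(raise_("harry'"), found)   # still contains an app redex
harry_found_reduced = red(harry_found)
```

Before reduction, the body of harry_found is an App whose function part is an Abs, matching the value of M in the query above; after red, instantiating the binder with x gives the term for (found' harry' x).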

4 Implementation of Coordination 

A primary goal of abstract syntax is to support
recursion through abstractions with bound variables.
This leads to the interpretation of a bound variable
as a "scoped constant" - it acts like a constant that
is not visible from the top of the term, but which
becomes visible during the descent through the
abstraction. See (Miller, 1991) for a discussion of how
this may be used for evaluation of functional programs
by "pushing" the evaluation through abstractions
to reduce redexes that are not at the top-level.
Figure 5: Implementation of the CCG category system

type coord   cat -> tm -> tm -> tm -> o.

coord (fs A B) (abs R) (abs S) (abs T) :-
    pi x\ (coord B (R x) (S x) (T x)).

coord (bs A B) (abs R) (abs S) (abs T) :-
    pi x\ (coord B (R x) (S x) (T x)).

coord B R S (and' R S) :- atomic-type B.

Figure 6: Implementation of coordination

This technique is also used in the β-reducer briefly
mentioned at the end of the previous section, and
a similar technique will be used here to implement
coordination by recursively descending through the
two arguments to be coordinated.

Before describing the implementation of coordination,
it is first necessary to mention how CCG
categories are represented in the λProlog code. As
shown in Figure 5, cat is declared to be a primitive
type, and np, s, conj, noun are the categories
used in this implementation. fs and bs are declared
to be constructors for forward and backward slash.
For example, the CCG category for a transitive verb
(s\np)/np would be represented as (fs np (bs np s)).
Also, the predicate atomic-type is declared to
be true for the four atomic categories. This will be
used in the implementation of coordination as a test
for termination of the recursion.

The implementation of coordination crucially uses
the capability of λProlog for universal quantification
in the goal of a clause. pi is the meta-level operator
for ∀, and ∀x.M is written as pi x\M. The
operational semantics for λProlog state that pi x\G is
provable if and only if [c/x]G is provable, where c is
a new variable of the same type as x that does not
otherwise occur in the current signature. In other
words, c is a scoped constant and the current signature
gets expanded with c for the proof of [c/x]G.
Since c is meant to be treated as a generic placeholder
for any arbitrary x of the proper type, c must
not appear in any terms instantiated for logic variables
during the proof of [c/x]G. The significance of
this restriction will be illustrated shortly.

The code for coordination is shown in Figure
6. The four arguments to coord are a category
and three terms that are the object-level LF
representations of constituents of that category. The
last argument will result from the coordination of
the second and third arguments. Consider again
the earlier problematic example (1a) of coordination.
Recall that after john is type-raised, its LF
will be (abs p\(app p john')) and similarly for bill.
They will both have the category (fs (bs np s) s).
Thus, to obtain the LF for John and Bill, the
following query would be made:

?- coord (fs (bs np s) s)
         (abs p\(app p john'))
         (abs p\(app p bill'))
         M.

This will match with the first clause for coord, with 

• A instantiated to (bs np s) 

• B to s 

• R to (p\(app p john')) 

• S to (p\(app p bill')) 

• and T a logic variable awaiting instantiation.

Then, after the meta-level β-reduction using the new
scoped constant c, the following goal is called:

?- coord s (app c john') (app c bill') N. 

where N = (T c). Since s is an atomic type, the 
third coord clause matches with 

• B instantiated to s 

• R to (app c john')

• S to (app c bill') 

• N to (and' (app c john') (app c bill')) 



Since N = (T c), higher-order unification is used by
λProlog to instantiate T by extracting c from N with
the result

T = x\(and' (app x john') (app x bill'))

and so M from the original query is

(abs x\(and' (app x john') (app x bill')))

Note that since c is a scoped constant arising from
the proof of a universal quantification, the
instantiation

T = x\(and' (app c john') (app x bill'))

is prohibited, along with the other extractions that
do not remove c from the body of the abstraction.
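The recursive descent can be imitated in Python as a rough sketch, under several assumptions that are not part of the paper: categories are tuples ("fs", A, B) and ("bs", A, B), a fresh Scoped object plays the role of the scoped constant c, and an explicit replace function stands in for the step that λProlog performs by higher-order unification when it extracts c from N. All names below are hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Callable

@dataclass(frozen=True)
class Abs:
    body: Callable[[Any], Any]

@dataclass(frozen=True)
class App:
    fun: Any
    arg: Any

@dataclass(frozen=True)
class Scoped:
    ident: int   # distinguishes one scoped constant from another

ATOMIC = {"np", "s", "conj", "noun"}
_fresh = count()

def replace(t, c, x):
    # Abstract the scoped constant c out of t by putting x in its place.
    if t == c:
        return x
    if isinstance(t, App):
        return App(replace(t.fun, c, x), replace(t.arg, c, x))
    if isinstance(t, Abs):
        return Abs(lambda y: replace(t.body(y), c, x))
    if isinstance(t, tuple):
        return tuple(replace(a, c, x) for a in t)
    return t

def coord(cat, r, s):
    # Third clause of Figure 6: atomic category ends the recursion.
    if isinstance(cat, str) and cat in ATOMIC:
        return ("and'", r, s)
    # First two clauses: descend under the binder with a fresh constant.
    _slash, _arg, result = cat
    c = Scoped(next(_fresh))
    body = coord(result, r.body(c), s.body(c))
    return Abs(lambda x: replace(body, c, x))

john = Abs(lambda p: App(p, "john'"))
bill = Abs(lambda p: App(p, "bill'"))
john_and_bill = coord(("fs", ("bs", "np", "s"), "s"), john, bill)
```

Because replace substitutes every occurrence of c, no scoped constant survives in the result, which corresponds to the restriction that c may not appear in the instantiation of T.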

This use of universal quantification to extract out
c from a term containing c in this case gives the same
result as a direct implementation of the rule for
coordination of unary functions (7a) would. However,
this same process of recursive descent via scoped
constants will work for any member of the conj rule
family. For example, the following query

?- coord
     (fs np (bs np s))
     (abs obj\(abs sub\(like' sub obj)))
     (abs obj\(abs sub\(hate' sub obj)))
     M.

M = (abs x\
      (abs x1\
        (and' (like' x1 x)
              (hate' x1 x)))).

corresponds to rule (7b). Note also that the use
of the same bound variable names obj and sub
causes no difficulty, since scoped constants,
meta-level β-reduction, and higher-order unification
are used to access and manipulate the inner terms.
Also, whereas (Park, 1992) requires careful consideration
of handling of determiners with coordination,
here such sentences are handled just like any others.
For example, the sentence Mary gave every dog a
bone and some policeman a flower results in the LF:¹²

(and'
  (exists x\((bone' x) &&
    (forall x1\((dog' x1)
      >> (gave' mary' x x1)))))
  (exists x\((flower' x) &&
    (exists x1\((policeman' x1)
      && (gave' mary' x x1))))))

12 This is a case in which the particular LF assumed
here fails to yield another available scoping. See
footnote 10.



Thus, "generalized coordination", instead of being a
family of separate rules, can be expressed as a single
rule on recursive descent through logical forms.
(Steedman, 1990) also discusses "generalized
composition", and it may well be that a similar
implementation is possible for that family of rules as well.



5 Conclusion 



We have shown how higher-order logic programming 
can be used to elegantly implement the semantic the- 
ory of CCG, including the previously difficult case 
of its handling of coordination constructs. The tech- 
niques used here should allow similar advantages for 
a variety of such theories. 

An argument can be made that the approach
taken here relies on a formalism that entails
implementation issues that are more difficult than for
the other solutions and inherently not as efficient.
However, the implementation issues, although more
complex, are also well-understood and it can be
expected that future work will bring further
improvements. For example, it is a straightforward matter
to transform the λProlog code into a logic called Lλ
(Miller, 1990) which requires only a restricted form
of unification that is decidable in linear time and
space. Also, the declarative nature of λProlog
programs opens up the possibility for applications of
program transformations such as partial evaluation.